component manifold
Manify: A Python Library for Learning Non-Euclidean Representations
Chlenski, Philippe, Du, Kaizhu, Satow, Dylan, Pe'er, Itsik
Leveraging manifold learning techniques, Manify provides tools for learning embeddings in (products of) non-Euclidean spaces, performing classification and regression with data that lives in such spaces, and estimating the curvature of a manifold. Manify aims to advance research and applications in machine learning by offering a comprehensive suite of tools for manifold-based data analysis.
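A core operation such a library needs is the distance in a product manifold: the l2 combination of per-component geodesic distances. A minimal NumPy sketch of that math (illustrative only, not Manify's actual API; all function names here are ours):

```python
import numpy as np

def sphere_dist(u, v, curvature=1.0):
    """Geodesic distance on a hypersphere of curvature K > 0 (radius 1/sqrt(K))."""
    r = 1.0 / np.sqrt(curvature)
    cos_angle = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return r * np.arccos(np.clip(cos_angle, -1.0, 1.0))

def hyperboloid_dist(u, v, curvature=-1.0):
    """Geodesic distance in the hyperboloid model, curvature K < 0.
    Points satisfy <u, u>_L = 1/K under the Minkowski inner product."""
    r = 1.0 / np.sqrt(-curvature)
    mink = -u[0] * v[0] + np.dot(u[1:], v[1:])   # Minkowski inner product
    return r * np.arccosh(np.clip(-mink * (-curvature), 1.0, None))

def euclidean_dist(u, v):
    return np.linalg.norm(u - v)

def product_dist(components_u, components_v, dists):
    """l2 combination of per-component geodesic distances."""
    per = [d(u, v) for d, u, v in zip(dists, components_u, components_v)]
    return float(np.sqrt(sum(p * p for p in per)))
```

For example, two orthogonal unit vectors on the unit sphere are a quarter great circle apart (pi/2), and the Euclidean component contributes its ordinary l2 distance.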
Mixed-curvature decision trees and random forests
Chlenski, Philippe, Chu, Quentin, Khan, Raiyan R., Moretti, Antonio Khalil, Pe'er, Itsik
Decision trees (DTs) and their random forest (RF) extensions are workhorses of classification and regression in Euclidean spaces. However, algorithms for learning in non-Euclidean spaces are still limited. We extend DT and RF algorithms to product manifolds: Cartesian products of several hyperbolic, hyperspherical, or Euclidean components. Such manifolds handle heterogeneous curvature while still factorizing neatly into simpler components, making them compelling embedding spaces for complex datasets. Our novel angular reformulation of DTs respects the geometry of the product manifold, yielding splits that are geodesically convex, maximum-margin, and composable. In the special cases of single-component manifolds, our method simplifies to its Euclidean or hyperbolic counterparts, or introduces hyperspherical DT algorithms, depending on the curvature. We benchmark our method on various classification, regression, and link prediction tasks on synthetic data, graph embeddings, mixed-curvature variational autoencoder latent spaces, and empirical data. Compared to six other classifiers, product DTs and RFs ranked first on 21 of 22 single-manifold benchmarks and 18 of 35 product manifold benchmarks, and placed in the top 2 on 53 of 57 benchmarks overall. This highlights the value of product DTs and RFs as straightforward yet powerful new tools for data analysis in product manifolds. Code for our paper is available at https://github.com/pchlenski/embedders.
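The angular reformulation the abstract describes can be illustrated with a toy split rule: send each point left or right by comparing its angle in a chosen 2-D coordinate plane to a threshold, then score the split with an impurity measure. A hedged sketch (names like `angular_split` are ours, not the authors' code):

```python
import numpy as np

def angular_split(X, dim_pair, theta):
    """Partition points by their angle in the (i, j) coordinate plane,
    compared against threshold angle `theta` (radians).
    Returns a boolean mask: True -> left child."""
    i, j = dim_pair
    angles = np.arctan2(X[:, j], X[:, i])
    return angles < theta

def gini(y):
    """Gini impurity of a label vector, used to score candidate splits."""
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)
```

A full tree learner would search over coordinate pairs and thresholds to minimize the children's weighted impurity; this sketch only shows the geometric form of a single candidate split.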
Mixed-Curvature Decision Trees and Random Forests
Chlenski, Philippe, Chu, Quentin, Pe'er, Itsik
We extend decision tree and random forest algorithms to mixed-curvature product spaces. Such spaces, defined as Cartesian products of Euclidean, hyperspherical, and hyperbolic manifolds, can often embed points from pairwise distances with much lower distortion than in single manifolds. To date, all classifiers for product spaces fit a single linear decision boundary, and no regressor has been described. Our method overcomes these limitations by enabling simple, expressive classification and regression in product manifolds. We demonstrate the superior accuracy of our tool compared to Euclidean methods operating in the ambient space for component manifolds covering a wide range of curvatures, as well as on a selection of product manifolds.
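The embedding-distortion criterion mentioned above can be sketched as the mean relative error between embedded and target pairwise distances (a common definition; the paper may use a different variant):

```python
import numpy as np

# Hedged illustration: average relative distortion between a target distance
# matrix (e.g., graph distances) and distances realized by an embedding.

def average_distortion(D_true, D_emb):
    """Mean of |d_emb - d_true| / d_true over all distinct pairs (i < j)."""
    iu = np.triu_indices_from(D_true, k=1)
    return float(np.mean(np.abs(D_emb[iu] - D_true[iu]) / D_true[iu]))
```

A lower value means the embedding space reproduces the target geometry more faithfully, which is the sense in which product spaces can outperform single manifolds.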
Fine-grained Optimization of Deep Neural Networks
In recent studies, several asymptotic upper bounds on the generalization error of deep neural networks (DNNs) have been derived theoretically. These bounds are functions of several norms of the DNNs' weights, such as the Frobenius and spectral norms, and they are computed for weights grouped according to either the input or output channels of the DNNs. In this work, we conjecture that if we impose multiple constraints on the weights of DNNs to upper bound their norms, and train the DNNs with these weights, then we can attain empirical generalization errors closer to the derived theoretical bounds and improve the accuracy of the DNNs. To this end, we pose two problems. First, we aim to obtain weights whose different norms are all upper bounded by a constant, e.g., 1.0. To achieve these bounds, we propose a two-stage renormalization procedure: (i) normalization of the weights according to the different norms used in the bounds, and (ii) reparameterization of the normalized weights to set a constant, finite upper bound on their norms. In the second problem, we consider training DNNs with these renormalized weights. To this end, we first propose a strategy for constructing joint spaces (manifolds) of weights according to the different constraints in DNNs. Next, we propose a fine-grained SGD algorithm (FG-SGD) for optimization on these weight manifolds to train DNNs with assurance of convergence to minima. Experimental results show that the image classification accuracy of baseline DNNs can be boosted using FG-SGD on collections of manifolds identified by multiple constraints.
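The two-stage renormalization idea can be sketched for a single weight matrix: divide by a chosen norm, then rescale so the resulting norm equals a fixed constant. This is our own minimal illustration, not the paper's implementation; the function name and interface are assumptions:

```python
import numpy as np

def renormalize(W, norm="fro", c=1.0):
    """Rescale W so its chosen norm equals the constant c.
    Supports the Frobenius and spectral norms mentioned in the bounds."""
    if norm == "fro":
        n = np.linalg.norm(W, "fro")
    elif norm == "spectral":
        n = np.linalg.norm(W, 2)  # largest singular value
    else:
        raise ValueError(f"unknown norm: {norm}")
    return c * W / n

W = np.array([[3.0, 0.0], [0.0, 4.0]])
W_hat = renormalize(W, norm="spectral")
# W has spectral norm 4, so W_hat = W / 4 has spectral norm exactly 1
```

In the paper's setting this constraint would be maintained throughout training (the FG-SGD optimization on weight manifolds), not applied once as here.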
Signal Recovery on Incoherent Manifolds
Hegde, Chinmay, Baraniuk, Richard G.
Suppose that we observe noisy linear measurements of an unknown signal that can be modeled as the sum of two component signals, each of which arises from a nonlinear sub-manifold of a high dimensional ambient space. We introduce SPIN, a first order projected gradient method to recover the signal components. Despite the nonconvex nature of the recovery problem and the possibility of underdetermined measurements, SPIN provably recovers the signal components, provided that the signal manifolds are incoherent and that the measurement operator satisfies a certain restricted isometry property. SPIN significantly extends the scope of current recovery models and algorithms for low dimensional linear inverse problems and matches (or exceeds) the current state of the art in terms of performance.
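The flavor of such a first-order projected gradient method can be shown in a deliberately simplified toy (our construction, not the authors' code): recover x = a + b from y = Phi(a + b) by taking gradient steps and projecting each component back onto its "manifold". Here the two manifolds are orthogonal linear subspaces, so projection is a matrix product; SPIN itself projects onto nonlinear sub-manifolds.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 8, 6                                     # ambient dim, number of measurements
Pa = np.diag([1.0, 1.0, 0, 0, 0, 0, 0, 0])      # projector onto span(e0, e1)
Pb = np.diag([0, 0, 1.0, 1.0, 0, 0, 0, 0])      # projector onto span(e2, e3)
Phi = rng.standard_normal((m, n)) / np.sqrt(m)  # measurement operator

a_true = Pa @ rng.standard_normal(n)            # component on "manifold" A
b_true = Pb @ rng.standard_normal(n)            # component on "manifold" B
y = Phi @ (a_true + b_true)                     # underdetermined measurements

a = np.zeros(n)
b = np.zeros(n)
eta = 0.3                                       # step size
for _ in range(10000):
    r = Phi @ (a + b) - y                       # residual
    a = Pa @ (a - eta * Phi.T @ r)              # gradient step, project onto A
    b = Pb @ (b - eta * Phi.T @ r)              # gradient step, project onto B
```

Recovery succeeds here because the two subspaces are incoherent (orthogonal) and the random Phi acts near-isometrically on their 4-dimensional sum, a linear-algebra analogue of the paper's incoherence and restricted isometry conditions.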
High Dimensional Data Fusion via Joint Manifold Learning
Davenport, Mark A. (Stanford University) | Hegde, Chinmay (Rice University) | Duarte, Marco F. (Princeton University) | Baraniuk, Richard G. (Rice University)
The emergence of low-cost sensing architectures for diverse modalities has made it possible to deploy sensor networks that acquire large amounts of very high-dimensional data. To cope with such a data deluge, manifold models are often developed that provide a powerful theoretical and algorithmic framework for capturing the intrinsic structure of data governed by a low-dimensional set of parameters. However, these models do not typically take into account dependencies among multiple sensors. We thus propose a new joint manifold framework for data ensembles that exploits such dependencies. We show that joint manifold structure can lead to improved performance for manifold learning. Additionally, we leverage recent results concerning random projections of manifolds to formulate a universal, network-scalable dimensionality reduction scheme that efficiently fuses the data from all sensors.
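The fusion scheme can be sketched as follows (our construction): concatenate all sensors' high-dimensional readings into one joint vector, then apply a single random Gaussian projection. Johnson-Lindenstrauss-type arguments say norms and pairwise distances are approximately preserved, so manifold learning can run on the compressed data.

```python
import numpy as np

rng = np.random.default_rng(1)
n_sensors, d, k = 4, 256, 64                   # 4 sensors x 256 dims -> 64 dims total
x_joint = rng.standard_normal(n_sensors * d)   # concatenated sensor readings
A = rng.standard_normal((k, n_sensors * d)) / np.sqrt(k)
y = A @ x_joint                                # fused, compressed representation
# With this scaling, E[||y||^2] = ||x_joint||^2: squared norms are
# preserved in expectation, and concentrate for moderate k.
```

Because the same random matrix works for any joint manifold, each sensor can compress locally and the network only needs to transmit and aggregate the low-dimensional projections.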
MiPPS: A Generative Model for Multi-Manifold Clustering
Koyejo, Oluwasanmi (University of Texas at Austin) | Ghosh, Joydeep (University of Texas at Austin)
We propose a generative model for high-dimensional data consisting of intrinsically low-dimensional clusters that are noisily sampled. The proposed model is a mixture of probabilistic principal surfaces (MiPPS) optimized using expectation maximization. We use a Bayesian prior on the model parameters to maximize the corresponding marginal likelihood. We also show empirically that this optimization can be biased towards a good local optimum by using our prior intuition to guide the initialization phase. The proposed unsupervised algorithm naturally handles cases where the data lies on multiple connected components of a single manifold and where the component manifolds intersect. In addition to clustering, we learn a functional model for the underlying structure of each component cluster as a parameterized hyper-surface in ambient noise. This model is used to learn a global embedding that we use for visualization of the entire dataset. We demonstrate the performance of MiPPS in separating and visualizing land cover types in a hyperspectral dataset.
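The EM machinery underlying such mixture models can be shown in a much-simplified stand-in: one EM iteration for a mixture of isotropic Gaussians (probabilistic principal surfaces add a latent manifold parameterization, omitted here; all names are ours):

```python
import numpy as np

def log_gauss(X, mu, var):
    """Log density of an isotropic Gaussian N(mu, var * I) at each row of X."""
    d = X.shape[1]
    return -0.5 * (np.sum((X - mu) ** 2, axis=1) / var + d * np.log(2 * np.pi * var))

def em_step(X, mus, var, pis):
    """One EM iteration for a K-component isotropic Gaussian mixture."""
    # E-step: responsibilities r[n, k] = p(component k | x_n)
    logp = np.stack(
        [np.log(pis[k]) + log_gauss(X, mus[k], var) for k in range(len(pis))],
        axis=1,
    )
    logp -= logp.max(axis=1, keepdims=True)   # stabilize before exponentiating
    r = np.exp(logp)
    r /= r.sum(axis=1, keepdims=True)
    # M-step: re-estimate means and mixing weights from the responsibilities
    nk = r.sum(axis=0)
    mus_new = (r.T @ X) / nk[:, None]
    return mus_new, nk / len(X), r
```

In MiPPS each component is a principal surface rather than a point mean, and a Bayesian prior enters the M-step, but the alternation between responsibilities and parameter updates has this same shape.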